Topic Models for Translation Quality Estimation for Gisting Purposes

Authors

  • Raphael Rubino
  • José G. C. de Souza
  • Jennifer Foster
  • Lucia Specia
Abstract

This paper addresses the problem of predicting how adequate a machine translation is for gisting purposes. It focuses on the contribution of lexicalised features based on different types of topic models, as we believe these features are more robust than those used in previous work, which depend on linguistic processors that are often unreliable on automatic translations. Experiments with a number of datasets show promising results: the use of topic models outperforms the state-of-the-art approaches by a large margin in all datasets annotated for adequacy.

Similar resources

Relevance Ranking for Translated Texts

The usefulness of a translated text for gisting purposes strongly depends on the overall translation quality of the text, but especially on the translation quality of the most informative portions of the text. In this paper we address the problems of ranking translated sentences within a document and ranking translated documents within a set of documents on the same topic according to their inf...

QuEst - A translation quality estimation framework

We describe QUEST, an open source framework for machine translation quality estimation. The framework allows the extraction of several quality indicators from source segments, their translations, external resources (corpora, language models, topic models, etc.), as well as language tools (parsers, part-of-speech tags, etc.). It also provides machine learning algorithms to build quality estimati...

BiTAM: Bilingual Topic AdMixture Models for Word Alignment

We propose a novel bilingual topical admixture (BiTAM) formalism for word alignment in statistical machine translation. Under this formalism, the parallel sentence-pairs within a document-pair are assumed to constitute a mixture of hidden topics; each word-pair follows a topic-specific bilingual translation model. Three BiTAM models are proposed to capture topic sharing at different levels of l...

Document-level translation quality estimation: exploring discourse and pseudo-references

Predicting the quality of machine translations is a challenging topic. Quality estimation (QE) of translations is based on features of the source and target texts (without the need for human references), and on supervised machine learning methods to build prediction models. Engineering well-performing features is therefore crucial in QE modelling. Several features have been used so far, but the...

On the Translation Quality of Google Translate: With a Concentration on Adjectives

Translation, whose first traces date back at least to 3000 BC (Newmark, 1988), has always been considered time-consuming and labor-consuming. In view of this, experts have made numerous efforts to develop some mechanical systems which can reduce part of this time and labor. The advancement of computers in the second half of the twentieth century paved the ground for the invention of machine tra...

Publication date: 2013